Data used in image segmentation are not always defined on the same grid. This is particularly the case for medical images, where the resolution, field of view and orientation can vary across channels and subjects. Images and labels are therefore commonly resampled onto the same grid as a preprocessing step. However, the resampling operation introduces partial volume effects and blurring, thereby changing the effective resolution and reducing the contrast between structures. In this paper, we propose a splat layer that automatically handles resolution mismatches in the input data. This layer pushes each image onto a mean space in which the forward pass is performed. Because the splat operator is the adjoint of the resampling operator, the mean-space prediction can be pulled back to the native label space, where the loss function is computed. The need for explicit resolution adjustment using interpolation is thereby removed. On two publicly available datasets, with simulated and real multi-modal magnetic resonance images, we show that this model improves segmentation results compared with resampling as a preprocessing step.
translated by Google Translate
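The splat layer's key property, that splatting is the adjoint of the resampling (pull) operator, can be checked numerically. Below is a minimal 1-D sketch with hypothetical grids (my own illustration, not the paper's implementation): it builds a linear-interpolation pull matrix `A` and verifies the adjoint identity <A x, y> = <x, A^T y>, which is what allows a mean-space prediction to be pulled back to the native label space.

```python
import numpy as np

def pull_matrix(src_x, dst_x):
    """Linear-interpolation resampling matrix: rows = dst points, cols = src points."""
    A = np.zeros((len(dst_x), len(src_x)))
    for i, x in enumerate(dst_x):
        j = np.clip(np.searchsorted(src_x, x) - 1, 0, len(src_x) - 2)
        w = (x - src_x[j]) / (src_x[j + 1] - src_x[j])
        A[i, j] = 1 - w
        A[i, j + 1] = w
    return A

src = np.linspace(0, 1, 5)   # native-space grid (illustrative)
dst = np.linspace(0, 1, 9)   # mean-space grid (illustrative)
A = pull_matrix(src, dst)

img = np.random.rand(5)      # native-space image
lab = np.random.rand(9)      # mean-space signal (e.g. a prediction gradient)

pulled = A @ img             # resample: native -> mean space
splatted = A.T @ lab         # splat:    mean -> native space (the adjoint)

# adjoint identity <A x, y> == <x, A^T y> holds for any pair of signals
assert np.isclose(pulled @ lab, img @ splatted)
```

In the paper's setting the same pairing lets the loss be evaluated in each subject's native label space while the network runs in one common mean space.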
We describe CounterSynth, a conditional generative model of diffeomorphic deformations that induces label-driven, biologically plausible changes in volumetric brain images. The model is intended to synthesise counterfactual training data for downstream discriminative modelling tasks in which fidelity is limited by data imbalance, distributional instability, confounding or model deficiencies, and in which performance is inequitable across distinct subpopulations. Focusing on demographic attributes, we evaluate the quality of the synthesised counterfactuals with voxel-based morphometry, with classification and regression of the conditioning attributes, and with the Fréchet inception distance. Examining downstream discriminative performance in the context of engineered demographic imbalance and confounding, we use UK Biobank magnetic resonance imaging data to benchmark CounterSynth augmentation against current solutions to these problems. We achieve state-of-the-art improvements in both overall fidelity and equity. The source code for CounterSynth is available online.
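Deformations that are guaranteed to be diffeomorphic are commonly obtained by exponentiating a stationary velocity field with scaling and squaring. The 1-D toy sketch below illustrates only that generic construction; it is not CounterSynth's code, and the velocity field is hypothetical. A valid 1-D diffeomorphism must be strictly monotone (no folding), which the final assertion checks.

```python
import numpy as np

def exp_velocity(v, steps=6):
    """Integrate a stationary 1-D velocity field (in voxel units) by
    scaling and squaring: halve it `steps` times, then self-compose."""
    n = len(v)
    grid = np.arange(n, dtype=float)
    phi = grid + v / (2 ** steps)      # small initial displacement
    for _ in range(steps):
        # compose the map with itself, phi <- phi o phi, via linear interpolation
        phi = np.interp(phi, grid, phi)
    return phi

v = 0.5 * np.sin(np.linspace(0, 2 * np.pi, 32))  # a smooth toy velocity field
phi = exp_velocity(v)

# strictly increasing => no folding, i.e. the 1-D map is invertible
assert np.all(np.diff(phi) > 0)
```

Composing a small, smooth displacement with itself repeatedly is what keeps the overall deformation invertible even when the total displacement is large.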
Segmentation of brain magnetic resonance images (MRI) into anatomical regions is a useful task in neuroimaging. Manual annotation is time consuming and expensive, so a fully automated and general-purpose brain segmentation algorithm is highly desirable. To this end, we propose a patch-based label propagation approach built on a generative model with latent variables. Once trained, our Factorisation-based Image Labelling (FIL) model is able to label target images with a variety of image contrasts. We compare the effectiveness of the proposed model against the state of the art using data from the MICCAI 2012 Grand Challenge and Workshop on Multi-Atlas Labeling. Because our approach is intended to be general purpose, we also evaluate how well it handles domain shift by labelling images of the same subjects acquired with different MR contrasts.
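For context, here is a minimal sketch of the classical patch-based label-propagation idea that FIL builds on: each location in a target image receives the label of the best-matching atlas patch at that location. This is a 1-D toy with synthetic signals of my own construction, not the FIL generative model itself.

```python
import numpy as np

def label_propagation(target, atlases, atlas_labels, patch=3):
    """Label each position of `target` by copying the label of the atlas
    whose patch at that position matches best (nearest-patch vote)."""
    r = patch // 2
    n = len(target)
    out = np.zeros(n, dtype=int)
    for i in range(r, n - r):
        t = target[i - r:i + r + 1]
        # squared patch distance to every atlas at this location
        d = [np.sum((a[i - r:i + r + 1] - t) ** 2) for a in atlases]
        out[i] = atlas_labels[int(np.argmin(d))][i]
    return out

rng = np.random.default_rng(0)
sig = np.r_[np.zeros(10), np.ones(10)]              # two toy "tissues"
atlases = [sig + 0.05 * rng.standard_normal(20) for _ in range(3)]
labels = [np.r_[np.zeros(10, int), np.ones(10, int)]] * 3
target = sig + 0.05 * rng.standard_normal(20)       # unseen noisy subject

pred = label_propagation(target, atlases, labels)
```

FIL replaces this direct patch matching with latent-variable factorisation, which is what lets it generalise across image contrasts.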
Purpose: Inter-scan motion is a substantial source of error in $R_1$ estimation, and can be expected to increase at 7T, where the $B_1$ field is more inhomogeneous. The established correction scheme does not translate to 7T because it requires a body coil reference. Here we introduce two alternatives that outperform the established method. Because they compute relative sensitivities, they do not require body coil images. Theory: The proposed methods use coil-combined magnitude images to obtain relative coil sensitivities. The first method computes the relative sensitivities efficiently via a simple ratio; the second by fitting a more sophisticated generative model. Methods: $R_1$ maps were computed using the variable flip angle (VFA) approach. Multiple datasets were acquired at 3T and 7T, with and without motion between the acquisition of the VFA volumes. $R_1$ maps were constructed without correction, with the proposed corrections, and (at 3T) with the previously established correction scheme. Results: At 3T, the proposed methods outperformed the baseline method. Inter-scan motion artefacts were also reduced at 7T. However, reproducibility only converged on that of the no-motion condition when position-specific transmit field effects were also incorporated. Conclusion: The proposed methods simplify inter-scan motion correction of $R_1$ maps and are applicable at both 3T and 7T, where a body coil is typically not available. Open-source code for all methods is publicly available.
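The VFA approach referenced above estimates $R_1$ from spoiled gradient-echo signals acquired at two or more flip angles. A standard linearisation turns the fit into a straight line whose slope is $E_1 = \exp(-\mathrm{TR} \cdot R_1)$. The sketch below simulates noiseless signals and recovers $R_1$; the TR and flip angles are illustrative values of my choosing, and the coil-sensitivity corrections the abstract actually proposes are not modelled here.

```python
import numpy as np

def vfa_signal(r1, m0, alpha, tr):
    """Spoiled gradient-echo steady-state signal at flip angle alpha (rad)."""
    e = np.exp(-tr * r1)
    return m0 * np.sin(alpha) * (1 - e) / (1 - e * np.cos(alpha))

def fit_r1(signals, alphas, tr):
    """Linearised VFA fit: S/sin(a) = E1 * S/tan(a) + M0*(1 - E1),
    so the slope of the line gives E1 = exp(-TR * R1)."""
    y = signals / np.sin(alphas)
    x = signals / np.tan(alphas)
    e1, _ = np.polyfit(x, y, 1)   # slope, intercept
    return -np.log(e1) / tr

tr = 0.025                         # 25 ms repetition time (illustrative)
alphas = np.deg2rad([6.0, 21.0])   # two flip angles (illustrative)
s = vfa_signal(1.0, 100.0, alphas, tr)   # simulate tissue with R1 = 1.0 s^-1

assert np.isclose(fit_r1(s, alphas, tr), 1.0)
```

The abstract's point is that uncorrected inter-scan motion changes the effective coil sensitivity between the two flip-angle volumes, which biases exactly this slope; the proposed relative-sensitivity corrections remove that bias without a body coil.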
We demonstrate a proof-of-concept of a large language model conducting corporate lobbying related activities. We use an autoregressive large language model (OpenAI's text-davinci-003) to determine if proposed U.S. Congressional bills are relevant to specific public companies and provide explanations and confidence levels. For the bills the model deems as relevant, the model drafts a letter to the sponsor of the bill in an attempt to persuade the congressperson to make changes to the proposed legislation. We use hundreds of ground-truth labels of the relevance of a bill to a company to benchmark the performance of the model, which outperforms the baseline of predicting the most common outcome of irrelevance. However, we test the ability to determine the relevance of a bill with the previous OpenAI GPT-3 model (text-davinci-002), which was state-of-the-art on many language tasks until text-davinci-003 was released on November 28, 2022. The performance of text-davinci-002 is worse than simply always predicting that a bill is irrelevant to a company. These results suggest that, as large language models continue to improve core natural language understanding capabilities, performance on corporate lobbying related tasks will continue to improve. We then discuss why this could be problematic for societal-AI alignment.
Variational autoencoders model high-dimensional data by positing low-dimensional latent variables that are mapped through a flexible distribution parametrized by a neural network. Unfortunately, variational autoencoders often suffer from posterior collapse: the posterior of the latent variables is equal to its prior, rendering the variational autoencoder useless as a means to produce meaningful representations. Existing approaches to posterior collapse often attribute it to the use of neural networks or optimization issues due to variational approximation. In this paper, we consider posterior collapse as a problem of latent variable non-identifiability. We prove that the posterior collapses if and only if the latent variables are non-identifiable in the generative model. This fact implies that posterior collapse is not a phenomenon specific to the use of flexible distributions or approximate inference. Rather, it can occur in classical probabilistic models even with exact inference, which we also demonstrate. Based on these results, we propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility. This model class resolves the problem of latent variable non-identifiability by leveraging bijective Brenier maps and parameterizing them with input convex neural networks, without special variational inference objectives or optimization tricks. Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.
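The paper's central claim, that posterior collapse is exactly latent-variable non-identifiability and can occur even with exact inference, can be seen in a one-dimensional linear-Gaussian model whose posterior is available in closed form. The sketch below is my own toy illustration, not the paper's model class: when the weight $w$ is zero the likelihood no longer depends on the latent, and the exact posterior equals the prior.

```python
import numpy as np

def gaussian_posterior(x, w, sigma2):
    """Exact posterior p(z | x) for the model
    z ~ N(0, 1),  x = w * z + eps,  eps ~ N(0, sigma2).
    Returns the posterior mean and variance (standard conjugate result)."""
    var = sigma2 / (w ** 2 + sigma2)
    mean = w * x / (w ** 2 + sigma2)
    return mean, var

# Informative weight: observing x moves the posterior away from the prior.
m, v = gaussian_posterior(x=2.0, w=1.0, sigma2=0.5)
assert abs(m) > 0.5 and v < 1.0

# w = 0: the latent is non-identifiable (the likelihood ignores z), and the
# exact posterior collapses to the N(0, 1) prior, with no neural network
# or approximate inference involved.
m0, v0 = gaussian_posterior(x=2.0, w=0.0, sigma2=0.5)
assert m0 == 0.0 and v0 == 1.0
```

This mirrors the paper's argument: collapse is a property of the generative model, which is why their fix (bijective Brenier maps parameterised by input convex networks) targets identifiability rather than the inference procedure.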
We introduce Argoverse 2 (AV2) - a collection of three datasets for perception and forecasting research in the self-driving domain. The annotated Sensor Dataset contains 1,000 sequences of multimodal data, encompassing high-resolution imagery from seven ring cameras, and two stereo cameras in addition to lidar point clouds, and 6-DOF map-aligned pose. Sequences contain 3D cuboid annotations for 26 object categories, all of which are sufficiently-sampled to support training and evaluation of 3D perception models. The Lidar Dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose. This dataset is the largest ever collection of lidar sensor data and supports self-supervised learning and the emerging task of point cloud forecasting. Finally, the Motion Forecasting Dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene. Models are tasked with the prediction of future motion for "scored actors" in each scenario and are provided with track histories that capture object location, heading, velocity, and category. In all three datasets, each scenario contains its own HD Map with 3D lane and crosswalk geometry - sourced from data captured in six distinct cities. We believe these datasets will support new and existing machine learning research problems in ways that existing datasets do not. All datasets are released under the CC BY-NC-SA 4.0 license.
In this paper we derive a PAC-Bayesian-Like error bound for a class of stochastic dynamical systems with inputs, namely, for linear time-invariant stochastic state-space models (stochastic LTI systems for short). This class of systems is widely used in control engineering and econometrics, in particular, they represent a special case of recurrent neural networks. In this paper we 1) formalize the learning problem for stochastic LTI systems with inputs, 2) derive a PAC-Bayesian-Like error bound for such systems, 3) discuss various consequences of this error bound.
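A stochastic LTI state-space model of the kind the bound covers can be written x_{t+1} = A x_t + B u_t + w_t, y_t = C x_t + v_t. The sketch below simulates such a system with illustrative matrices of my own choosing (not from the paper) and checks the step response against the closed-form steady state (I - A)^{-1} B u.

```python
import numpy as np

def simulate_lti(A, B, C, u, x0, noise_std, rng):
    """Simulate  x_{t+1} = A x_t + B u_t + w_t,  y_t = C x_t + v_t,
    with i.i.d. Gaussian process and measurement noise."""
    x, ys = x0, []
    for t in range(len(u)):
        ys.append(C @ x + noise_std * rng.standard_normal(C.shape[0]))
        x = A @ x + B @ u[t] + noise_std * rng.standard_normal(len(x))
    return np.array(ys)

rng = np.random.default_rng(0)
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])        # stable: spectral radius < 1
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])

u = np.ones((200, 1))             # unit step input
y = simulate_lti(A, B, C, u, np.zeros(2), 0.01, rng)

# steady state: (I - A)^{-1} B * 1 = [12.5, 2.5], so y settles near 12.5
assert abs(y[-1, 0] - 12.5) < 0.5
```

Stability (spectral radius of A below one) is the kind of structural assumption PAC-Bayesian analyses of such systems typically rely on, since it controls how past noise and inputs are forgotten.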
We demonstrate how efficient autonomous drone swarms can be in detecting and tracking occluded targets in densely forested areas, such as lost people during search and rescue missions. Exploration and optimization of local viewing conditions, such as occlusion density and target view obliqueness, provide much faster and much more reliable results than previous, blind sampling strategies that are based on pre-defined waypoints. An adapted real-time particle swarm optimization and a new objective function are presented that are able to deal with dynamic and highly random through-foliage conditions. Synthetic aperture sensing is our fundamental sampling principle, and drone swarms are employed to approximate the optical signals of extremely wide and adaptable airborne lenses.
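The adapted real-time particle swarm optimisation mentioned above builds on the canonical PSO update, in which each particle is pulled toward its own best-seen position and the swarm's global best. The sketch below is the textbook algorithm on a simple quadratic objective with illustrative hyperparameters; it is not the paper's adapted variant or its through-foliage objective function.

```python
import numpy as np

def pso(f, dim, n=30, iters=200, lo=-5.0, hi=5.0, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimisation of f over the box [lo, hi]^dim."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, (n, dim))          # particle positions
    v = np.zeros((n, dim))                     # particle velocities
    pbest = x.copy()                           # personal bests
    pval = np.apply_along_axis(f, 1, x)
    g = pbest[pval.argmin()].copy()            # global best
    for _ in range(iters):
        r1, r2 = rng.random((n, dim)), rng.random((n, dim))
        # inertia + cognitive pull (personal best) + social pull (global best)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        val = np.apply_along_axis(f, 1, x)
        better = val < pval
        pbest[better], pval[better] = x[better], val[better]
        g = pbest[pval.argmin()].copy()
    return g, pval.min()

# toy objective: squared distance to the point (1, 1)
best, fbest = pso(lambda p: np.sum((p - 1.0) ** 2), dim=2)
assert fbest < 1e-3
```

In the paper's setting the "position" would parameterise sampling viewpoints and the objective would score target visibility through foliage, with the update running in real time as the swarm flies.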
Generative AI has matured to a point where large-scale models can generate text that seems indistinguishable from human-written text and remarkably photorealistic images. Automatically measuring how close the distribution of generated data is to the target real data distribution is a key step in diagnosing existing models and developing better models. We present MAUVE, a family of comparison measures between pairs of distributions such as those encountered in the generative modeling of text or images. These scores are statistical summaries of divergence frontiers capturing two types of errors in generative modeling. We explore four approaches to statistically estimate these scores: vector quantization, non-parametric estimation, classifier-based estimation, and parametric Gaussian approximations. We provide statistical bounds for the vector quantization approach. Empirically, we find that the proposed scores paired with a range of $f$-divergences and statistical estimation methods can quantify the gaps between the distributions of human-written text and those of modern neural language models by correlating with human judgments and identifying known properties of the generated texts. We conclude the paper by demonstrating its applications to other AI domains and discussing practical recommendations.
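MAUVE's scores summarise a divergence frontier: for mixtures r_lam = lam*p + (1-lam)*q, the frontier traces the pairs (KL(q || r_lam), KL(p || r_lam)) as lam varies. The sketch below computes that frontier for two small hand-picked discrete distributions; in the paper's vector-quantization approach the distributions would instead come from quantised model and human text embeddings, which this toy skips.

```python
import numpy as np

def divergence_frontier(p, q, lambdas=np.linspace(0.01, 0.99, 25)):
    """KL coordinates of the divergence frontier between two discrete
    distributions p and q (already quantised into shared bins)."""
    def kl(a, b):
        m = a > 0
        return float(np.sum(a[m] * np.log(a[m] / b[m])))
    pts = []
    for lam in lambdas:
        r = lam * p + (1 - lam) * q   # mixture is positive wherever p or q is
        pts.append((kl(q, r), kl(p, r)))
    return np.array(pts)

p = np.array([0.4, 0.3, 0.2, 0.1])   # toy "model" distribution
q = np.array([0.1, 0.2, 0.3, 0.4])   # toy "human" distribution
front = divergence_frontier(p, q)

# every point on the frontier is a pair of non-negative KL divergences
assert np.all(front >= 0.0)
```

One endpoint of the frontier reflects errors where the model puts mass the data does not (type I), the other where the model misses data modes (type II); MAUVE reduces the whole curve to a scalar summary.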